AITopics | time-series dataset

Collaborating Authors

time-series dataset

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

949b3011c50300a2b4e60377466f52a8-Paper-Conference.pdf

Neural Information Processing SystemsFeb-10-2026, 20:47:35 GMT

We also propose a simple method to convert these bounds and other similar ones in traditional deep learning and machine learning to Bayesian counterparts for both i.i.d. and Markov datasets.

artificial intelligence, generalization error, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

Measuring Time-Series Dataset Similarity using Wasserstein Distance

Chen, Hongjie, Mehra, Akshay, Kimball, Josh, Rossi, Ryan A.

arXiv.org Artificial IntelligenceJul-31-2025

The emergence of time-series foundation model research elevates the growing need to measure the (dis)similarity of time-series datasets. A time-series dataset similarity measure aids research in multiple ways, including model selection, finetuning, and visualization. In this paper, we propose a distribution-based method to measure time-series dataset similarity by leveraging the Wasserstein distance. We consider a time-series dataset an empirical instantiation of an underlying multivariate normal distribution (MVN). The similarity between two time-series datasets is thus computed as the Wasserstein distance between their corresponding MVNs. Comprehensive experiments and visualization show the effectiveness of our approach. Specifically, we show how the Wasserstein distance helps identify similar time-series datasets and facilitates inference performance estimation of foundation models in both out-of-distribution and transfer learning evaluation, with high correlations between our proposed measure and the inference loss (>0.60).

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2507.22189

Country: North America > United States (0.14)

Genre:

Research Report (0.64)
Overview (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Using matrix-product states for time-series machine learning

Moore, Joshua B., Stackhouse, Hugo P., Fulcher, Ben D., Mahmoodian, Sahand

arXiv.org Machine LearningDec-20-2024

Matrix-product states (MPS) have proven to be a versatile ansatz for modeling quantum many-body physics. For many applications, and particularly in one-dimension, they capture relevant quantum correlations in many-body wavefunctions while remaining tractable to store and manipulate on a classical computer. This has motivated researchers to also apply the MPS ansatz to machine learning (ML) problems where capturing complex correlations in datasets is also a key requirement. Here, we develop and apply an MPS-based algorithm, MPSTime, for learning a joint probability distribution underlying an observed time-series dataset, and show how it can be used to tackle important time-series ML problems, including classification and imputation. MPSTime can efficiently learn complicated time-series probability distributions directly from data, requires only moderate maximum MPS bond dimension $\chi_{\rm max}$, with values for our applications ranging between $\chi_{\rm max} = 20-150$, and can be trained for both classification and imputation tasks under a single logarithmic loss function. Using synthetic and publicly available real-world datasets, spanning applications in medicine, energy, and astronomy, we demonstrate performance competitive with state-of-the-art ML approaches, but with the key advantage of encoding the full joint probability distribution learned from the data. By sampling from the joint probability distribution and calculating its conditional entanglement entropy, we show how its underlying structure can be uncovered and interpreted. This manuscript is supplemented with the release of a publicly available code package MPSTime that implements our approach. The efficiency of the MPS-based ansatz for learning complex correlation structures from time-series data is likely to underpin interpretable advances to challenging time-series ML problems across science, industry, and medicine.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Machine Learning

2412.15826

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
(3 more...)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Time is Not Enough: Time-Frequency based Explanation for Time-Series Black-Box Models

Chung, Hyunseung, Jo, Sumin, Kwon, Yeonsu, Choi, Edward

arXiv.org Artificial IntelligenceAug-12-2024

Despite the massive attention given to time-series explanations due to their extensive applications, a notable limitation in existing approaches is their primary reliance on the time-domain. This overlooks the inherent characteristic of time-series data containing both time and frequency features. In this work, we present Spectral eXplanation (SpectralX), an XAI framework that provides time-frequency explanations for time-series black-box classifiers. This easily adaptable framework enables users to "plug-in" various perturbation-based XAI methods for any pre-trained time-series classification models to assess their impact on the explanation quality without having to modify the framework architecture. Additionally, we introduce Feature Importance Approximations (FIA), a new perturbation-based XAI method. These methods consist of feature insertion, deletion, and combination techniques to enhance computational efficiency and class-specific explanations in time-series classification tasks. We conduct extensive experiments in the generated synthetic dataset and various UCR Time-Series datasets to first compare the explanation performance of FIA and other existing perturbation-based XAI methods in both time-domain and time-frequency domain, and then show the superiority of our FIA in the time-frequency domain with the SpectralX framework. Finally, we conduct a user study to confirm the practicality of our FIA in SpectralX framework for class-specific time-frequency based time-series explanations. The source code is available in https://github.com/gustmd0121/Time_is_not_Enough

classifier, dataset, explanation, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3627673.3679844

2408.03636

Country:

North America > United States > Idaho > Ada County > Boise (0.06)
Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Transportation > Air (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.68)

Add feedback

WEATHER-5K: A Large-scale Global Station Weather Dataset Towards Comprehensive Time-series Forecasting Benchmark

Han, Tao, Guo, Song, Chen, Zhenghao, Xu, Wanghan, Bai, Lei

arXiv.org Machine LearningJun-20-2024

Global Station Weather Forecasting (GSWF) is crucial for various sectors, including aviation, agriculture, energy, and disaster preparedness. Recent advancements in deep learning have significantly improved the accuracy of weather predictions by optimizing models based on public meteorological data. However, existing public datasets for GSWF optimization and benchmarking still suffer from significant limitations, such as small sizes, limited temporal coverage, and a lack of comprehensive variables. These shortcomings prevent them from effectively reflecting the benchmarks of current forecasting methods and fail to support the real needs of operational weather forecasting. To address these challenges, we present the WEATHER-5K dataset. This dataset comprises a comprehensive collection of data from 5,672 weather stations worldwide, spanning a 10-year period with one-hour intervals. It includes multiple crucial weather elements, providing a more reliable and interpretable resource for forecasting. Furthermore, our WEATHER-5K dataset can serve as a benchmark for comprehensively evaluating existing well-known forecasting models, extending beyond GSWF methods to support future time-series research challenges and opportunities.

dataset, forecasting, weather station, (13 more...)

arXiv.org Machine Learning

2406.14399

Country:

Asia > China > Shanghai > Shanghai (0.04)
Oceania > Australia (0.04)
South America > Argentina (0.04)
(33 more...)

Genre: Research Report (1.00)

Industry:

Government (1.00)
Food & Agriculture > Agriculture (0.66)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

Large Pre-trained time series models for cross-domain Time series analysis tasks

Kamarthi, Harshavardhan, Prakash, B. Aditya

arXiv.org Artificial IntelligenceNov-19-2023

Large pre-trained models have been instrumental in significant advancements in domains like language and vision making model training for individual downstream tasks more efficient as well as provide superior performance. However, tackling time-series analysis tasks usually involves designing and training a separate model from scratch leveraging training data and domain expertise specific to the task. We tackle a significant challenge for pre-training a general time-series model from multiple heterogeneous time-series dataset: providing semantically useful inputs to models for modeling time series of different dynamics from different domains. We observe that partitioning time-series into segments as inputs to sequential models produces semantically better inputs and propose a novel model LPTM that automatically identifies optimal dataset-specific segmentation strategy leveraging self-supervised learning loss during pre-training. LPTM provides performance similar to or better than domain-specific state-of-art model and is significantly more data and compute efficient taking up to 40% less data as well as 50% less training time to achieve state-of-art performance in a wide range of time-series analysis tasks from multiple disparate domains. Time-series analysis tasks involve important well-studied problems involving time-series datasets such as forecasting (Hyndman & Athanasopoulos, 2018) and classification (Chowdhury et al., 2022) with applications in wide-ranging domains such as retail, meteorology, economics, and health. Recent works (Chen et al., 2021; Wang et al., 2022; Zeng et al., 2023) have shown the efficacy of purely data-driven deep learning models in learning complex domain-specific properties of the time series over traditional statistic and mechanistic models across many domains. However, coming up with a model for a specific application or time-series analysis task is usually non-trivial.

baseline, dataset, forecasting, (16 more...)

arXiv.org Artificial Intelligence

2311.11413

Country:

South America > Paraguay > Asunción > Asunción (0.04)
Asia > Japan (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.34)
Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)
Banking & Finance > Trading (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Time Series Analysis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Feature-Based Time-Series Analysis in R using the theft Package

Henderson, Trent, Fulcher, Ben D.

arXiv.org Artificial IntelligenceJul-3-2023

Time series are measured and analyzed across the sciences. One method of quantifying the structure of time series is by calculating a set of summary statistics or `features', and then representing a time series in terms of its properties as a feature vector. The resulting feature space is interpretable and informative, and enables conventional statistical learning approaches, including clustering, regression, and classification, to be applied to time-series datasets. Many open-source software packages for computing sets of time-series features exist across multiple programming languages, including catch22 (22 features: Matlab, R, Python, Julia), feasts (42 features: R), tsfeatures (63 features: R), Kats (40 features: Python), tsfresh (779 features: Python), and TSFEL (390 features: Python). However, there are several issues: (i) a singular access point to these packages is not currently available; (ii) to access all feature sets, users must be fluent in multiple languages; and (iii) these feature-extraction packages lack extensive accompanying methodological pipelines for performing feature-based time-series analysis, such as applications to time-series classification. Here we introduce a solution to these issues in an R software package called theft: Tools for Handling Extraction of Features from Time series. theft is a unified and extendable framework for computing features from the six open-source time-series feature sets listed above. It also includes a suite of functions for processing and interpreting the performance of extracted features, including extensive data-visualization templates, low-dimensional projections, and time-series classification operations. With an increasing volume and complexity of time-series datasets in the sciences and industry, theft provides a standardized framework for comprehensively quantifying and interpreting informative structure in time series.

artificial intelligence, dataset, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2208.06146

Country:

Oceania > Australia (0.04)
North America > United States > New York (0.04)
North America > United States > Florida > Miami-Dade County > Miami Beach (0.04)

Genre: Research Report > Experimental Study (0.48)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Software (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Time Series Analysis (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Add feedback

Machine Learning for Time-Series with Python: Forecast, predict, and detect anomalies with state-of-the-art machine learning methods: Ben Auffarth: 9781801819626: Amazon.com: Books

#artificialintelligenceJan-1-2022, 10:45:48 GMT

Become proficient in deriving insights from time-series data and analyzing a model's performance Machine learning has emerged as a powerful tool to understand hidden complexities in time-series datasets, which frequently need to be analyzed in areas as diverse as healthcare, economics, digital marketing, and social sciences. These datasets are essential for forecasting and predicting outcomes or for detecting anomalies to support informed decision making. This book covers Python basics for time-series and builds your understanding of traditional autoregressive models as well as modern non-parametric models. You will become confident with loading time-series datasets from any source, deep learning models like recurrent neural networks and causal convolutional network models, and gradient boosting with feature engineering. Machine Learning for Time-Series with Python explains the theory behind several useful models and guides you in matching the right model to the right problem.

machine learning, state-of-the-art machine, time-series dataset, (5 more...)

#artificialintelligence

Industry: Retail > Online (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Hands-On Guide To Darts - A Python Tool For Time Series Forecasting

#artificialintelligenceNov-3-2020, 00:15:23 GMT

Data collected over a certain period of time is called Time-series data. These data points are usually collected at adjacent intervals and have some correlation with the target. There are certain datasets that contain columns with date, month or days that are important for making predictions like sales datasets, stock price prediction etc. But the problem here is how to use the time-series data and convert them into a format the machine can understand? Python made this process a lot simpler by introducing a package called Darts.

darts, dataset, library, (11 more...)

#artificialintelligence

Country: Oceania > Australia (0.06)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)

Add feedback

A Periodicity-based Parallel Time Series Prediction Algorithm in Cloud Computing Environments

Chen, Jianguo, Li, Kenli, Rong, Huigui, Bilal, Kashif, Li, Keqin, Yu, Philip S.

arXiv.org Machine LearningOct-17-2018

In the era of big data, practical applications in various domains continually generate large-scale time-series data. Among them, some data show significant or potential periodicity characteristics, such as meteorological and financial data. It is critical to efficiently identify the potential periodic patterns from massive time-series data and provide accurate predictions. In this paper, a Periodicity-based Parallel Time Series Prediction (PPTSP) algorithm for large-scale time-series data is proposed and implemented in the Apache Spark cloud computing environment. To effectively handle the massive historical datasets, a Time Series Data Compression and Abstraction (TSDCA) algorithm is presented, which can reduce the data scale as well as accurately extracting the characteristics. Based on this, we propose a Multi-layer Time Series Periodic Pattern Recognition (MTSPPR) algorithm using the Fourier Spectrum Analysis (FSA) method. In addition, a Periodicity-based Time Series Prediction (PTSP) algorithm is proposed. Data in the subsequent period are predicted based on all previous period models, in which a time attenuation factor is introduced to control the impact of different periods on the prediction results. Moreover, to improve the performance of the proposed algorithms, we propose a parallel solution on the Apache Spark platform, using the Streaming real-time computing module. To efficiently process the large-scale time-series datasets in distributed computing environments, Distributed Streams (DStreams) and Resilient Distributed Datasets (RDDs) are used to store and calculate these datasets. Extensive experimental results show that our PPTSP algorithm has significant advantages compared with other algorithms in terms of prediction accuracy and performance.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

1810.07776

Country:

North America > United States (0.92)
Asia (0.68)

Genre: Research Report > New Finding (0.48)

Industry:

Banking & Finance > Trading (0.68)
Information Technology > Services (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback